Mirror Descent Search and Acceleration
Authors
Abstract
In recent years, attention has been focused on the relationship between black-box optimization and reinforcement learning. Black-box optimization is a framework for finding the input that optimizes the output of an unknown function. Reinforcement learning, by contrast, is a framework for finding, through trial and error, a policy that optimizes the expected cumulative reward. In this research, we propose a reinforcement learning algorithm based on the mirror descent method, a general optimization algorithm. The proposed method is called Mirror Descent Search. The contribution of this research is roughly twofold. First, we show that extension methods for mirror descent can be applied to reinforcement learning, and we consider such methods here. Second, we clarify the relationships among existing reinforcement learning algorithms. Based on these results, we propose Mirror Descent Search and its derivative methods. The experimental results show that learning with the proposed methods progresses faster.
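The abstract does not spell out the Mirror Descent Search update itself; as background, here is a minimal Python sketch of the plain mirror descent step it builds on, using the negative-entropy mirror map (the exponentiated-gradient update) so that iterates stay on the probability simplex. The function names and the toy objective are illustrative, not from the paper.

```python
import numpy as np

def mirror_descent_simplex(grad_fn, x0, step=0.1, iters=100):
    """Mirror descent with the negative-entropy mirror map
    (exponentiated-gradient update) on the probability simplex.

    grad_fn: callable returning a (sub)gradient of the objective at x.
    x0:      starting point, a probability vector.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        g = grad_fn(x)
        # Multiplicative update: minimizes <g, x> + (1/step) * KL(x || x_k)
        x = x * np.exp(-step * g)
        x /= x.sum()  # re-normalize onto the simplex
    return x

# Example: minimize the linear objective <c, x> over the simplex.
c = np.array([3.0, 1.0, 2.0])
x_star = mirror_descent_simplex(lambda x: c, np.ones(3) / 3)
print(x_star)  # mass concentrates on the smallest coordinate of c
```

With the Euclidean mirror map the same scheme reduces to ordinary (projected) gradient descent; the freedom in choosing the mirror map is what makes the method a general optimization algorithm.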
Similar Resources
On Stochastic Subgradient Mirror-Descent Algorithm with Weighted Averaging
This paper considers stochastic subgradient mirror-descent method for solving constrained convex minimization problems. In particular, a stochastic subgradient mirror-descent method with weighted iterate-averaging is investigated and its per-iterate convergence rate is analyzed. The novel part of the approach is in the choice of weights that are used to construct the averages. Through the use o...
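The excerpt leaves the paper's particular weight sequence unstated; as a minimal sketch of the weighted iterate-averaging idea in the Euclidean special case (plain stochastic subgradient descent), with generic step sizes and weights. All names here are illustrative assumptions.

```python
import numpy as np

def weighted_avg_sgd(sgrad_fn, x0, steps, weights):
    """Stochastic (sub)gradient descent with weighted iterate averaging.

    sgrad_fn: callable returning a stochastic subgradient at x.
    steps:    per-iteration step sizes a_k.
    weights:  averaging weights w_k; returns sum_k w_k x_k / sum_k w_k.
    """
    x = np.asarray(x0, dtype=float)
    avg, wsum = np.zeros_like(x), 0.0
    for a_k, w_k in zip(steps, weights):
        x = x - a_k * sgrad_fn(x)        # Euclidean mirror-descent step
        wsum += w_k
        avg += (w_k / wsum) * (x - avg)  # running weighted average
    return avg

# Example: noisy quadratic objective with uniform averaging weights.
rng = np.random.default_rng(0)
x_bar = weighted_avg_sgd(lambda x: 2 * x + 0.1 * rng.standard_normal(x.shape),
                         np.ones(5), steps=[0.1] * 200, weights=[1.0] * 200)
```

One choice analyzed in this line of work takes w_k proportional to the step size a_k, which is what distinguishes such schemes from plain uniform averaging.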
A Free Line Search Steepest Descent Method for Solving Unconstrained Optimization Problems
In this paper, we solve unconstrained optimization problems using a free line search steepest descent method. First, we propose a double-parameter scaled quasi-Newton formula for calculating an approximation of the Hessian matrix. The approximation obtained from this formula is a positive definite matrix that satisfies the standard secant relation. We also show that the largest eigenvalue...
Acceleration and Averaging in Stochastic Descent Dynamics
[1] Nemirovski and Yudin. Problem Complexity and Method Efficiency in Optimization. Wiley-Interscience Series in Discrete Mathematics. Wiley, 1983.
[2] W. Krichene, A. Bayen, and P. Bartlett. Accelerated Mirror Descent in Continuous and Discrete Time. NIPS 2015.
[3] W. Su, S. Boyd, and E. Candes. A differential equation for modeling Nesterov's accelerated gradient method: theory and insights. NI...
Accelerated Extra-Gradient Descent: A Novel Accelerated First-Order Method
We provide a novel accelerated first-order method that achieves the asymptotically optimal convergence rate for smooth functions in the first-order oracle model. To this day, Nesterov's Accelerated Gradient Descent (AGD) and variations thereof were the only methods achieving acceleration in this standard black-box model. In contrast, our algorithm is significantly different from a...
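The description of the new extra-gradient method is cut off above; as a point of reference for the baseline it is contrasted with, here is a minimal sketch of Nesterov's AGD for an L-smooth convex objective. This is the classical scheme, not the paper's algorithm.

```python
import numpy as np

def nesterov_agd(grad_fn, x0, L, iters=100):
    """Nesterov's accelerated gradient descent for an L-smooth convex
    objective; achieves the O(1/k^2) rate AGD is known for."""
    x = y = np.asarray(x0, dtype=float)
    t = 1.0
    for _ in range(iters):
        x_next = y - grad_fn(y) / L  # gradient step from the extrapolated point
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = x_next + ((t - 1.0) / t_next) * (x_next - x)  # momentum extrapolation
        x, t = x_next, t_next
    return x
```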
Extensions of the Hestenes-Stiefel and Polak-Ribiere-Polyak conjugate gradient methods with sufficient descent property
Using the search directions of a recent class of three-term conjugate gradient methods, modified versions of the Hestenes-Stiefel and Polak-Ribiere-Polyak methods are proposed which satisfy the sufficient descent condition. The methods are shown to be globally convergent when the line search fulfills the (strong) Wolfe conditions. Numerical experiments are done on a set of CUTEr unconstrained opti...
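The modified three-term directions themselves are not given in this excerpt; as a minimal sketch, here is the classical Polak-Ribiere-Polyak direction with a hypothetical steepest-descent fallback standing in for the sufficient descent safeguard (the constant 1e-4 and the fallback rule are illustrative assumptions, not the paper's construction).

```python
import numpy as np

def prp_direction(g_new, g_old, d_old):
    """Classical Polak-Ribiere-Polyak conjugate-gradient direction:

    beta_PRP = g_{k+1}^T (g_{k+1} - g_k) / ||g_k||^2
    d_{k+1}  = -g_{k+1} + beta_PRP * d_k
    """
    y = g_new - g_old
    beta = float(g_new @ y) / float(g_old @ g_old)
    d = -g_new + beta * d_old
    # The modified methods enforce a sufficient descent condition of the
    # form g^T d <= -c * ||g||^2; plain PRP does not guarantee it, so this
    # sketch falls back to steepest descent when the check fails.
    if float(g_new @ d) > -1e-4 * float(g_new @ g_new):
        d = -g_new
    return d
```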
Journal: CoRR
Volume: abs/1709.02535
Pages: -
Publication date: 2017